Stochastic optimization and sparse statistical recovery: Optimal algorithms for high dimensions
Authors
Alekh Agarwal, Sahand Negahban, Martin J. Wainwright
Abstract
We develop and analyze stochastic optimization algorithms for problems in which the expected loss is strongly convex and the optimum is (approximately) sparse. Previous approaches are able to exploit only one of these two structures, yielding an O(d/T) convergence rate for strongly convex objectives in d dimensions and an O(√(s(log d)/T)) convergence rate when the optimum is s-sparse. Our algorithm is based on successively solving a series of ℓ1-regularized optimization problems using Nesterov's dual averaging algorithm. We establish that the error of our solution after T iterations is at most O(s(log d)/T), with natural extensions to approximate sparsity. Our results apply to locally Lipschitz losses, including the logistic, exponential, hinge, and least-squares losses. By recourse to statistical minimax results, we show that our convergence rates are optimal up to constants. The effectiveness of our approach is also confirmed in numerical simulations, where we compare to several baselines on a least-squares regression problem.
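To make the multi-epoch strategy concrete, here is a minimal sketch, assuming a least-squares loss with an s-sparse optimum: each epoch runs ℓ1-regularized dual averaging (an RDA-style closed-form update via soft-thresholding, with the prox term centered at the epoch's warm start), and the next epoch restarts from the previous epoch's averaged iterate with a halved regularization weight. The epoch schedule and the constants gamma, lam, and n_steps are illustrative assumptions, not the paper's exact parameters.

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise prox of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def l1_dual_averaging_epoch(grad_oracle, x0, lam, n_steps, gamma=20.0):
    """One epoch of l1-regularized dual averaging (RDA-style).

    Keeps a running sum of stochastic gradients and solves
        argmin_x <g_bar, x> + lam*||x||_1 + (gamma/sqrt(t)) * ||x - x0||^2 / 2
    in closed form by soft-thresholding; returns the epoch's averaged iterate.
    """
    g_sum = np.zeros_like(x0)
    x = x0.copy()
    x_avg = np.zeros_like(x0)
    for t in range(1, n_steps + 1):
        g_sum += grad_oracle(x)
        beta = gamma * np.sqrt(t)                    # prox weight beta_t
        x = soft_threshold(x0 - g_sum / beta, t * lam / beta)
        x_avg += (x - x_avg) / t                     # running average of iterates
    return x_avg

# Toy problem: least-squares with an s-sparse optimum and an
# unbiased stochastic gradient oracle.
rng = np.random.default_rng(0)
d, s, sigma = 200, 5, 0.5
x_star = np.zeros(d)
x_star[:s] = 1.0

def grad_oracle(x):
    a = rng.standard_normal(d)
    y = a @ x_star + sigma * rng.standard_normal()
    return (a @ x - y) * a                           # grad of 0.5*(a@x - y)^2

x, lam = np.zeros(d), 0.5
for epoch in range(6):        # successive problems with halved regularization
    x = l1_dual_averaging_epoch(grad_oracle, x, lam, n_steps=1000)
    lam *= 0.5
print("l2 error:", np.linalg.norm(x - x_star))
```

Halving the regularization across epochs mirrors the idea of solving a series of progressively less-regularized problems; the paper's analysis ties the per-epoch lengths and regularization levels to the strong-convexity and sparsity parameters, which this sketch does not attempt to reproduce.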
Similar resources
Pathwise Coordinate Optimization for Sparse Learning: Algorithm and Theory
Pathwise coordinate optimization is one of the most important computational frameworks for high-dimensional convex and nonconvex sparse learning problems. It differs from classical coordinate optimization algorithms in three salient features: warm-start initialization, active-set updating, and a strong rule for coordinate preselection. Such a complex algorithmic structure grants superior ...
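A hedged sketch of how those three features fit together for the lasso is below; the screening threshold, path spacing, and tolerance are illustrative choices, not the paper's. A full implementation would also re-check KKT conditions on coordinates discarded by the strong rule.

```python
import numpy as np

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def lasso_path(X, y, lambdas, n_sweeps=50, tol=1e-7):
    """Coordinate descent for the lasso along a decreasing lambda path,
    with warm starts, a strong-rule screen, and active-set sweeps."""
    n, d = X.shape
    col_sq = (X ** 2).sum(axis=0) / n        # (1/n) * x_j' x_j
    w = np.zeros(d)                          # warm start carried across lambdas
    lam_prev = lambdas[0]
    path = []
    for lam in lambdas:
        # Strong rule: preselect coordinates whose gradient at the previous
        # solution is large enough to plausibly become nonzero. (KKT re-checks
        # on the discarded coordinates are omitted in this sketch.)
        grad = X.T @ (y - X @ w) / n
        active = np.flatnonzero(np.abs(grad) >= 2 * lam - lam_prev)
        r = y - X @ w                        # residual, updated incrementally
        for _ in range(n_sweeps):
            max_delta = 0.0
            for j in active:                 # cyclic pass over the active set
                w_j_old = w[j]
                r += X[:, j] * w_j_old       # add back coordinate j
                w[j] = soft_threshold(X[:, j] @ r / n, lam) / col_sq[j]
                r -= X[:, j] * w[j]
                max_delta = max(max_delta, abs(w[j] - w_j_old))
            if max_delta < tol:
                break
        lam_prev = lam
        path.append(w.copy())
    return path

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 50))
w_true = np.zeros(50)
w_true[:3] = 2.0
y = X @ w_true + 0.1 * rng.standard_normal(100)
sols = lasso_path(X, y, np.geomspace(1.0, 0.01, 20))
print("support at smallest lambda:", np.flatnonzero(np.abs(sols[-1]) > 1e-6))
```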
Estimation, Optimization, and Parallelism when Data is Sparse
We study stochastic optimization problems when the data is sparse, which is in a sense dual to current perspectives on high-dimensional statistical learning and optimization. We highlight both the difficulties—in terms of increased sample complexity that sparse data necessitates—and the potential benefits, in terms of allowing parallelism and asynchrony in the design of algorithms. Concretely, ...
Estimation, Optimization, and Parallelism when Data is Sparse or Highly Varying
We study stochastic optimization problems when the data is sparse, which is in a sense dual to the current understanding of high-dimensional statistical learning and optimization. We highlight both the difficulties—in terms of increased sample complexity that sparse data necessitates—and the potential benefits, in terms of allowing parallelism and asynchrony in the design of alg...
Multi-Step Stochastic ADMM in High Dimensions: Applications to Sparse Optimization and Matrix Decomposition
In this paper, we consider a multi-step version of the stochastic ADMM method with efficient guarantees for high-dimensional problems. We first analyze the simple setting, where the optimization problem consists of a loss function and a single regularizer (e.g. sparse optimization), and then extend to the multi-block setting with multiple regularizers and multiple variables (e.g. matrix decompo...
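As a rough illustration of the single-regularizer setting that abstract mentions, here is a minimal sketch of stochastic ADMM, assuming the splitting f(x) + lam·||z||_1 subject to x = z: the x-step linearizes the loss with a stochastic gradient plus a decaying proximal term, the z-step is a closed-form soft-threshold, and the dual variable is updated by ascent. The step sizes and iteration count are assumptions, not the paper's multi-step schedule or guarantees.

```python
import numpy as np

def soft_threshold(z, tau):
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def stochastic_admm_l1(grad_oracle, d, lam, rho=1.0, n_steps=2000, eta0=1.0):
    """Stochastic ADMM for min_x f(x) + lam*||z||_1 subject to x = z."""
    x, z, u = np.zeros(d), np.zeros(d), np.zeros(d)   # u is the scaled dual
    for k in range(1, n_steps + 1):
        eta = eta0 / np.sqrt(k)                 # decaying proximal step size
        g = grad_oracle(x)                      # stochastic gradient of f
        # x-step: linearized loss + augmented-Lagrangian quadratic,
        # minimized in closed form.
        x = (x / eta + rho * (z - u) - g) / (1.0 / eta + rho)
        # z-step: prox of (lam/rho)*||.||_1.
        z = soft_threshold(x + u, lam / rho)
        u = u + x - z                           # dual ascent
    return z

rng = np.random.default_rng(2)
d, s = 100, 4
x_star = np.zeros(d)
x_star[:s] = 1.0

def grad_oracle(x):
    a = rng.standard_normal(d)
    y = a @ x_star + 0.1 * rng.standard_normal()
    return (a @ x - y) * a

z = stochastic_admm_l1(grad_oracle, d, lam=0.05)
print("recovered support:", np.flatnonzero(np.abs(z) > 1e-3))
```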
Publication date: 2012